MA-SAM: Modality-agnostic SAM Adaptation for 3D Medical Image Segmentation
The Segment Anything Model (SAM), a foundation model for general image
segmentation, has demonstrated impressive zero-shot performance across numerous
natural image segmentation tasks. However, SAM's performance significantly
declines when applied to medical images, primarily due to the substantial
disparity between natural and medical image domains. To effectively adapt SAM
to medical images, it is important to incorporate critical third-dimensional
information, i.e., volumetric or temporal knowledge, during fine-tuning.
Simultaneously, we aim to harness SAM's pre-trained weights within its original
2D backbone to the fullest extent. In this paper, we introduce a
modality-agnostic SAM adaptation framework, named MA-SAM, that is applicable
to various volumetric and video medical data. Our method is rooted in a
parameter-efficient fine-tuning strategy that updates only a small portion of
weight increments while preserving the majority of SAM's pre-trained weights.
By injecting a series of 3D adapters into the transformer blocks of the image
encoder, our method enables the pre-trained 2D backbone to extract
third-dimensional information from input data. The effectiveness of our method
has been comprehensively evaluated on four medical image segmentation tasks,
using 10 public datasets across CT, MRI, and surgical video data. Remarkably,
without using any prompt, our method consistently outperforms various
state-of-the-art 3D approaches, surpassing nnU-Net by 0.9%, 2.6%, and 9.9% in
Dice for CT multi-organ segmentation, MRI prostate segmentation, and surgical
scene segmentation, respectively. Our model also demonstrates strong
generalization, and excels in challenging tumor segmentation when prompts are
used. Our code is available at: https://github.com/cchen-cc/MA-SAM
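
The results above are reported as Dice scores. For readers unfamiliar with the metric, here is a minimal sketch of the standard Dice coefficient for two binary masks, Dice = 2|A ∩ B| / (|A| + |B|). It is a generic illustration, not the evaluation code from the MA-SAM repository. The function name `dice_score`, the flat 0/1 list representation, and the convention of returning 1.0 when both masks are empty are all assumptions made for this example. The abstract does not say how scores are averaged over classes or volumes, so that step is omitted.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks.

    Both masks are flat sequences of 0/1 values of equal length
    (a hypothetical representation chosen for this sketch).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")

    # |A ∩ B|: positions where both masks are foreground.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # |A| + |B|: total foreground count in each mask.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)

    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


# Toy example: 3 predicted foreground voxels, 3 true ones, 2 in common.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
score = dice_score(pred, target)  # 2 * 2 / (3 + 3) = 0.666...
```

Multiplying the score by 100 gives the percentage form in which Dice differences such as 0.9%, 2.6%, and 9.9% are quoted.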